[57] ABSTRACT

A high-speed, large-quantity data transfer system is realized
using a synchronous control means having striped disks in
which the synchronizing cost between disks is reduced by
software control, using a time-out means, and using a data
transfer means that does not use user memory. An input-output
processing system for inputting and outputting a large
quantity of data comprises a logical disk control means. The
logical disk control means comprises a performance data
collection means for collecting data, and a logical disk
construction means for constructing a logical disk apparatus
using a plurality of disk apparatus. The logical disk
construction means forms logical disk management data so
that the stripe width is set in order to equalize the response
time needed for input and output of one stripe of data on
each disk apparatus constructing the logical disk apparatus,
and the logical disk control means controls the logical disk
apparatus by the logical disk management data.

18 Claims, 9 Drawing Sheets 
APPARATUS FOR FORMING LOGICAL
DISK MANAGEMENT DATA HAVING DISK
DATA STRIPE WIDTH SET IN ORDER TO
EQUALIZE RESPONSE TIME BASED ON
PERFORMANCE

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

The present invention relates to an input-output processing
system in a computer system, and in particular to an
input-output processing system for inputting and outputting
large amounts of data at high speed.

2. Description of the Prior Art 

In recent years, disk subsystems called Redundant Arrays
of Inexpensive Disks (RAID) have spread among personal
computers and workstations as magnetic disk storage devices
have become smaller and cheaper. RAID is a logical disk
apparatus comprising a plurality of disk apparatus in order
to improve performance and reliability. In general, one of
six construction levels, from RAID level 0 to RAID level 5,
is adopted according to the purpose of use. Generally, RAID
is realized as a function of a disk controller, and in a
personal computer it is often realized as a function of a
SCSI host adaptor.

On the other hand, it is also known that a logical disk
apparatus can be constructed from a plurality of disk
apparatus controlled by software instead of hardware. Typical
products using such technology are, for example, Logical
Volume Manager of IBM and Volume Manager of Veritas. These
products have functions such as disk concatenation, disk
mirroring and disk striping. Some products attempt to cover
all RAID functions.

In a RAID system, RAID level 3 or RAID level 0 is used
in order to input and output large quantities of data rapidly.
RAID level 3 is a byte-interleaved construction in which
parity data is added to the data layout. RAID level 0 is
equivalent to disk striping by software, where parity data is
not added.

Two improvements in input-output performance can be expected
from adopting disk striping and RAID. One is an improvement
of input-output performance for users, which is obtained by
allocating input-output requests to respective disk apparatus
and operating these disk apparatus in parallel when the
requested input-output data size spans a plurality of disk
apparatus holding interleaved data. The other is an
improvement of the through-put performance of a system, which
is obtained by distributing concurrent accesses to a logical
disk apparatus over a plurality of disk apparatus. In order to
input and output large quantities of data rapidly, the former
improvement is pursued in the present invention.

Although input and output for a logical disk apparatus
constructed by striping are distributed to respective disk
apparatus, it is necessary to synchronize all inputs and
outputs distributed from the logical disk apparatus for every
input and output access. If there is a difference in processing
time between the respective disk apparatus operating in
parallel, a waiting time corresponding to that difference is
needed to finish the processing. This deteriorates the
input-output performance of the logical disk apparatus.
Factors that disturb the synchronization include performance
differences between the respective disk apparatus, and dynamic
performance differences caused by seek (positioning),
rotational waiting, queuing on the data bus, and so on.

In order to decrease the overhead of waiting time between
disk apparatus in a RAID system, for example in a system to
which RAID level 3 is applied, it is known that the system
is constructed from homogeneous disk apparatus and all disk
rotations are synchronized so as to avoid the disturbance of
synchronization caused by rotational waiting of disks, and
that the sector positions at which data are arranged are
shifted little by little so as to absorb the delay of
input-output instructions to respective disk apparatus.
However, this technique requires the assumption that the RAID
can be designed as a single hardware subsystem (MICHELLE Y.
KIM, "Synchronized Disk Interleaving", IEEE TRANSACTIONS ON
COMPUTERS, VOL. C-35, NO. 11, NOVEMBER 1986).

Further, from a more detailed point of view, prior art is
disclosed in laid-open Japanese patent publication No.
5-27910, where location gap correction of the head of a disk
apparatus is regarded as a factor disturbing synchronization
between disk apparatus. In this prior art, when a location
gap correction is needed in a certain reference disk, commands
are issued to every disk apparatus in the disk array to
perform location gap correction, in order to minimize the
synchronization disturbance caused by location gap
correction.

In the laid-open Japanese patent publication No. 
5-257611, it is disclosed that a logical disk apparatus is 
realized in a flexible construction by software control in a 
RAID system. In other words, there is disclosed a data 
layout method, where logical disk apparatus having different 
RAID levels are mixed flexibly on one or a plurality of disk 
arrays, for partitioning of disk array, and performance dete- 
rioration by disturbing synchronization between disk appa- 
ratus is decreased. 

As described above, logical disk apparatus constructed
by software control have been disclosed in the prior art.
However, no logical disk apparatus has been constructed on
the basis of performance data obtained by measuring the disk
apparatus that make up the logical disk apparatus, in order
to construct the most suitable logical disk apparatus.

The present invention provides an efficient input-output 
processing system to improve a through-put of the entire 
system. More concretely, the object of this invention is to 
improve the performance of input-output system by con- 
structing the most suitable logical disk apparatus using 
ordinary disk apparatus instead of using RAID. 

In processing large quantities of data such as image data,
the quantity of input-output data at one time becomes very
large. For this case, the invention provides an input-output
processing system which can receive less data than the
specified quantity and start its processing earlier, instead
of waiting for the whole of the specified large quantity of
data to be transmitted in response to one input-output
instruction.

Further, the invention provides an input-output system 
which makes it possible to transmit data efficiently between 
input-output units. 

SUMMARY OF THE INVENTION 

According to one aspect of the invention, an input-output
processing system for inputting and outputting a large
quantity of data comprises a logical disk control means,
where the logical disk control means includes a performance
data collection means for collecting performance data, either
from data in which the performance characteristics of a
plurality of disks constructing the input-output system are
given by a system manager, or from direct measurement of the
performance by operating the disks; and a logical disk
construction means for constructing a logical disk using all
disks on the basis of the performance data collected by the
performance data collection means, where the logical disk
construction means produces logical disk management data
which sets the stripe width in order to equalize the response
time needed for input and output of one stripe of data on
each disk constructing the logical disk; and the logical disk
control means controls the logical disks by the logical disk
management data.

According to further aspect of the invention, the input-
output processing system for inputting and outputting a large
quantity of data further comprises a time limit setting means
for setting a time limit for operating the input-output unit
when sending an input-output instruction to input-output
units; and a means for completing the input-output operation
which was started by the input-output instruction when the
set time limit has passed.

According to further aspect of the invention, the input-
output processing system for inputting and outputting a large
quantity of data of claim 2 further comprises:

a timer for timing the limit set by the time limit setting
means; and
a means for completing the input-output operation upon
receiving time expiration information from the timer.
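
The time limit and timer described in the two aspects above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name read_with_limit, the chunk iterator, and the injectable clock are hypothetical and do not appear in the specification. The sketch shows an input operation that is completed either when all data has arrived or when the set time limit expires, returning whatever data was received so far.

```python
import time


def read_with_limit(chunks, limit_s, clock=time.monotonic):
    """Collect data until the source is exhausted or the time limit expires.

    chunks  : iterator yielding pieces of data (hypothetical stand-in for a device)
    limit_s : time limit in seconds, set when the instruction is issued
    clock   : time source; injectable so the behaviour can be tested
    Returns (received, timed_out).
    """
    deadline = clock() + limit_s      # the "time limit setting means"
    received = []
    for chunk in chunks:
        received.append(chunk)
        if clock() >= deadline:       # the "timer" reports expiration
            return received, True     # complete the operation with partial data
    return received, False            # all data arrived within the limit
```

A caller can start processing the returned data immediately and inspect timed_out to learn whether the specified quantity was transferred in full.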

According to further aspect of the invention, the input- 
output processing system for inputting and outputting a large 
quantity of data: wherein the input-output unit is a logical 
disk apparatus comprised of a plurality of disk apparatus. 

According to further aspect of the invention, the input-
output processing system for inputting and outputting a large
quantity of data further comprises a time limit setting means
for setting a time limit on the processing of an input-output
instruction when sending the input-output instruction to
input-output units; and a processing completion means for
completing the input-output operation which was started by
the input-output instruction when the set time limit has
expired.

According to further aspect of the invention, the input- 
output processing system for inputting and outputting a large 
quantity of data: wherein the performance measurement 
means performs an input-output instruction to measure the 
response time by setting conditions in which the response 
time for each disk apparatus is the shortest or longest. 

According to further aspect of the invention, the input- 
output processing system for inputting and outputting a large 
quantity of data: wherein the performance measurement 
means carries out performance measurement of the disk 
apparatus as a part of initialization process to the disk 
apparatus when disk apparatus are added to the system. 

According to further aspect of the invention, the input- 
output processing system for inputting and outputting a large 
quantity of data: wherein the performance data collection 
means collects bus performance data by construction of 
input-output bus which a plurality of disk apparatus are 
connected to, system construction data to which the system 
manager gives bus transfer performance, or actual operation 
of an input-output unit connected to the bus by the perfor- 
mance measurement means; the logical disk construction 
means constructs a logical disk apparatus on the basis of this 
collected performance data considering bus transfer perfor- 
mance connected to each disk performance. 

According to further aspect of the invention, the input-
output processing system for inputting and outputting a large
quantity of data: wherein in the logical disk construction
means, the data transfer quantity for every input-output
instruction assigned to a disk apparatus comprising the
logical disk apparatus is arranged within one track of the
disk apparatus.

According to further aspect of the invention, the input-
output processing system for inputting and outputting a large
quantity of data: wherein the logical disk construction means
equalizes the apparent input-output performance of a logical
disk apparatus no matter where the data is placed on the disk
apparatus, by alternately combining the inner and outer
circumferences of the respective disk apparatus, if the
logical disk apparatus is constructed from a plurality of
homogeneous disk apparatus whose performance differs between
the inner and outer circumferences.

According to further aspect of the invention, the input-
output processing system for inputting and outputting a large
quantity of data: wherein when an input-output request is
sent to a logical disk apparatus, the logical disk control
means dynamically judges the state of the head position of
each disk apparatus shortly before sending an input-output
instruction to every disk apparatus constructing the logical
disk apparatus, and sends the input-output instructions so
that the time for carrying out input and output to the
logical disk apparatus becomes shortest.

According to further aspect of the invention, the input-
output processing system for inputting and outputting a large
quantity of data: wherein when an input-output request is
sent to a logical disk apparatus using an input-output size
which generates a plurality of input-output requests to each
of a plurality of disk apparatus; the logical disk control
means judges the performance characteristics of the disk
apparatus comprising this logical disk apparatus and the
characteristics of the input-output bus which is connected to
a disk, and decreases the frequency of sending a plurality of
input-output instructions to the same disk apparatus, as
necessary.

According to further aspect of the invention, the input-
output processing system for inputting and outputting a large
quantity of data: wherein when an input-output request is
sent to a logical disk apparatus using an input-output size
which generates a plurality of input-output requests to each
of a plurality of disk apparatus; the logical disk control
means sends the input-output instructions so that the
input-output time to the logical disk apparatus becomes
shortest by arranging a synchronizing point in the
input-output requests to the plurality of disk apparatuses.

According to further aspect of the invention, the input-
output processing system for inputting and outputting a large
quantity of data: wherein the logical disk control means
automatically constructs a logical disk apparatus which
satisfies the performance given by the system manager.

According to further aspect of the invention, the input-
output processing system for inputting and outputting a large
quantity of data, further comprises a data transfer driver for
carrying out data transfer control to input-output units, the
data transfer driver carries out data transfer by ensuring a
two-face buffer on the system memory, when an apparatus
for sending the input-output instruction includes a logical
disk apparatus.

According to further aspect of the invention, the input-
output processing system for inputting and outputting a large
quantity of data, further comprises a bus arbiter for
controlling an input-output bus connected to an input-output
unit; the bus arbiter comprises a source register and a
destination register for registering a data transfer source ID
and a data transfer destination ID, respectively; and the data
transfer driver carries out data transfer without using system
memory by driving the bus arbiter when an apparatus for
sending the input-output instruction does not include a
logical disk apparatus.

According to further aspect of the invention, the input-
output processing system for inputting and outputting a large
quantity of data: wherein the data transfer driver drives the
bus arbiter by ensuring a buffer for data transfer on the
system memory, when there is a difference of data transfer
rate between the input units and output units which are
appointed by an input-output instruction.
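
The three data transfer cases in the aspects above can be summarized as a small decision function. The following Python sketch is an illustration only: the names Path and choose_path, the boolean and rate parameters, and the order in which the conditions are checked are assumptions, since the text lists the cases without stating a precedence.

```python
from enum import Enum


class Path(Enum):
    DOUBLE_BUFFER = "two-face buffer on system memory"
    BUFFERED_ARBITER = "bus arbiter with a buffer on system memory"
    DIRECT_ARBITER = "bus arbiter without system memory"


def choose_path(src_is_logical, dst_is_logical, src_rate, dst_rate):
    """Pick a transfer path for one input-output instruction.

    src_is_logical / dst_is_logical : whether either end is a logical disk apparatus
    src_rate / dst_rate             : data transfer rates of the two units (same unit)
    The precedence of the checks is an assumption made for this sketch.
    """
    if src_is_logical or dst_is_logical:
        return Path.DOUBLE_BUFFER      # logical disk involved: double buffering
    if src_rate != dst_rate:
        return Path.BUFFERED_ARBITER   # rate mismatch: absorb it in a buffer
    return Path.DIRECT_ARBITER         # matched devices: direct transfer via the arbiter
```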

According to further aspect of the invention, the input- 
output processing system for inputting and outputting a large 
quantity of data: further comprising a plurality of disk
apparatus comprising the logical disk apparatus which are
connected to different input-output buses to construct a
plurality of logical disk apparatus, further comprising a copy
means arranged between disk apparatuses connected to the
same input-output bus at a disk control apparatus for
controlling respective disk apparatus connected to the
respective input-output buses, in order to carry out data copy
between different logical disk apparatus.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an example of a hardware construction of a
system which an input-output processing system of the
present invention is applied to.

FIG. 2 shows another example of a software construction
of a system which an input-output processing system of the
present invention is applied to.

FIG. 3 shows an example of performance data of a disk
provided by a performance measurement means of the
present invention.

FIG. 4 shows a flowchart of processes to construct a
logical disk apparatus automatically which is carried out by
a volume manager of the present invention.

FIG. 5 shows an example of logical disk management data
of the present invention.

FIG. 6 shows an example of logical disk construction data
of the present invention.

FIG. 7 shows a construction of a disk controller of the
present invention.

FIG. 8 shows an example of operation status of a logical
disk apparatus of the present invention.

FIG. 9 shows a construction of a bus arbiter of the present
invention.

FIG. 10 shows a flowchart of processes of data transfer
driver of the present invention.

FIG. 11 shows a flowchart of processes of bus arbiter of
the present invention.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENTS

EMBODIMENT 1

FIG. 1 shows a construction of a hardware system which
the present invention is applied to. In FIG. 1, a CPU (central
processing unit) 101 carries out processing of the whole
system, a memory 102 stores instructions and data which
CPU 101 processes, and a system timer 103 measures the
time. These units are respectively connected to a memory
bus 104.

Memory bus 104 is connected to a local bus (main
input-output bus) 107 through a bridge 100. This local bus
107 is controlled by a bus arbiter 105. Four SCSI bus
adapters 109, each having a DMA controller, and a high-speed
device 108 are connected to local bus 107, where each SCSI
bus adaptor 109 is connected to a SCSI bus 112, respectively.
This high-speed device 108 has a DMA interface. For
example, the high-speed device 108 instantaneously processes
a large quantity of data transmitted from a disk
apparatus, like a sort processor, and outputs the processing
result again to the disk apparatus.

Further, four disk apparatus 111 having the same
construction are connected to each SCSI bus 112 through a disk
controller 110.

In this embodiment, the same disk apparatus are used as
disk apparatus 111, but naturally, different kinds of disk
apparatus may be connected to each SCSI bus 112 through
a disk controller 110.

FIG. 2 shows a software construction of the present
invention. In FIG. 2, a control program 201 carries out data
transfer from a logical disk apparatus to high-speed device
108, or conversely, from high-speed device 108 to disk
apparatus 111 or a logical disk apparatus. This control
program 201 operates as a user application program. In this
embodiment, an operating system 202 is mainly used as a
framework, and the functions of the operating system are
hardly used. Control program 201 essentially requests a data
transfer driver 203 to control the data transfer. In other
words, the control program 201 only instructs, for example, to
write data of 30 MB from an offset 1 MB of device A to device
B. Data transfer driver 203 is incorporated into a kernel as a
pseudo driver. A volume manager 204 constructs a logical disk
apparatus using a plurality of disk apparatus by software
control, which excludes overhead and obtains a shorter
response time, in other words, higher-speed transfer of large
quantities of data, or a confirmed response within a
certain limited time (this volume manager 204 constructs a
part of a logical disk control means). A disk driver 205 (a
part of a logical disk control means) controls disk apparatus
111. A device driver 206 (a part of a logical disk control
means) controls the high-speed device 108. When a data
transfer function of bus arbiter 105 is utilized, data transfer
is carried out directly. The logical disk control means
controls not only a logical disk apparatus but also
input-output units other than a logical disk apparatus.

Here, a function of volume manager 204, which is the
most important function of the present invention, is
explained as follows. Volume manager 204 provides a
logical disk apparatus having a larger consecutive data area
than that of an actual disk apparatus by constructing a logical
disk apparatus comprised of a plurality of disk apparatuses,
provides disk mirroring where the data being written to a
logical disk apparatus are written to a plurality of disk
apparatus, and also provides high-speed input-output
performance by a stripe construction of the logical disk
apparatus. In other words, volume manager 204 constructs a
logical disk apparatus using disk apparatus under the control
of volume manager 204 according to the instructions of a
system manager. A user accesses the logical disk apparatus
without any direct access to the disk apparatus controlled by
volume manager 204. Naturally, a system manager need not
put every disk apparatus under the control of volume manager
204.

Further explanation as to volume manager 204 is as
follows. Volume manager 204 comprises a disk introduction
means 207 which introduces a disk apparatus into a system,
a disk management data 210 which is composed of a disk
label, which is read by disk introduction means 207 from a
disk apparatus when a disk is introduced, and a disk attribute
file of the system, a performance measurement means which
measures performance of a disk apparatus and which is started
by the disk introduction means, a performance data 209
measured by the performance measurement means, a logical
disk construction means 213 which generates logical disk
construction data from the performance data 209 and the
disk management data 210, and a logical disk input-output
interface which serves as the interface when accessing the
logical disk apparatus using the logical disk construction data
generated in the logical disk construction means 213.

Here, a technique called disk striping is summarized as
follows. To make it easily understood, assume that two
completely equivalent disk apparatus are connected to an
input-output bus which is high-speed enough. Consecutive
large-quantity data of a certain size is divided into units
of, for example, 128 sectors (this unit of data is called a
'chunk' hereafter). Chunks are stored alternately in the
respective disk apparatus in order from the first. When an
input-output of two chunks in size is requested at the top of
a logical disk apparatus constructed by striping, the
input-output request is distributed to the two disk
apparatuses and processed in parallel. Therefore, the apparent
input-output performance is doubled.
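
The chunk layout just described can be illustrated with a short Python sketch. It assumes equal-width chunks of 128 sectors placed round-robin on the disks, as in the two-disk example; the function names locate and split_request and the tuple layout are hypothetical and are not taken from the specification.

```python
CHUNK_SECTORS = 128  # chunk size used in the example in the text


def locate(logical_sector, num_disks=2, chunk_sectors=CHUNK_SECTORS):
    """Map a logical sector to (disk index, sector offset on that disk)."""
    chunk = logical_sector // chunk_sectors   # which chunk holds the sector
    within = logical_sector % chunk_sectors   # position inside that chunk
    disk = chunk % num_disks                  # chunks alternate between disks
    stripe = chunk // num_disks               # full stripes preceding this chunk
    return disk, stripe * chunk_sectors + within


def split_request(start_sector, length, num_disks=2, chunk_sectors=CHUNK_SECTORS):
    """Split one logical request into per-chunk (disk, offset, count) pieces."""
    pieces = []
    sector = start_sector
    end = start_sector + length
    while sector < end:
        disk, offset = locate(sector, num_disks, chunk_sectors)
        # Stop at the chunk boundary or at the end of the request.
        count = min(chunk_sectors - sector % chunk_sectors, end - sector)
        pieces.append((disk, offset, count))
        sector += count
    return pieces
```

For a two-chunk request at the top of the logical disk, split_request(0, 256) yields one 128-sector piece for each of the two disks, which can then be issued in parallel.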

The following example explains how a system manager who
needs a disk apparatus whose performance is twice as good as
a usual disk apparatus realizes it by utilizing two usual
disk apparatuses and the volume manager 204. The system
manager introduces the volume manager 204 into a system. The
volume manager 204 gives an instruction for introducing two
new disk apparatus to the disk introduction means 207, and
registers them under control of volume manager 204. Receiving
the request for registration, the volume manager 204
initializes the disk apparatus 111 and writes management
metadata (logical disk management data 211) to the disk
apparatus 111 for constructing a logical disk apparatus. At
this stage, a logical disk apparatus is not constructed yet.
Further, the system manager requests the logical disk
construction means 213 in the volume manager 204 to construct
a striped logical disk apparatus using the two disk
apparatus. Logical disk construction means 213 forms logical
disk management data 211; thereby, a user is finally able to
utilize the logical disk apparatus. When a user accesses the
logical disk apparatus, the user accesses a logical disk
input-output interface 212 through the operating system 202.
Receiving the input-output request, volume manager 204 refers
to the logical disk construction data for the logical disk
apparatus, and gives input-output instructions to the actual
disk apparatus 111 which comprise the logical disk apparatus.

Operations of this input-output processing system are
explained in order below. To begin with, the performance
measurement of a disk apparatus unit and the measurement of
bus efficiency, which are carried out when a disk apparatus
is introduced into a system, are explained.

First, they are summarized as follows. These measurements
estimate the response performance according to the
input-output size of a disk apparatus, the response
performance according to disk address, the disk cache effect
and so on, by using performance measurement means 208 (a part
of the performance data collecting means), in order to know
the performance features of the respective disk apparatus
constructed by striping. Concretely, input-output operation of
the disk apparatus is carried out using these parameters, and
the response time (a part of performance data 209) is measured
and the result is recorded. On the basis of performance data
209 of the respective disk apparatus, the data quantity which
should be allocated to each disk apparatus, that is, the
stripe width, is determined so as to equalize the response
time required for the input-output of one stripe of data on
each disk apparatus comprising the disk stripe. According
to the resultant data, the logical disk construction means 213
constructs the logical disk construction data 211, that is, a
logical disk apparatus of striping construction. Input-output
requests to a logical disk apparatus of such a construction
are distributed to the respective disk apparatus comprising
the stripe. The respective input-output units respond in
roughly the same time. Therefore, roughly speaking, the
apparent response time is reduced to one over the number of
disk apparatus constructing the stripe, in comparison with the
response time for input-output to a single disk apparatus.
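
One way to picture the stripe width determination is the following Python sketch. It is not the method of the specification, which does not give a formula; it assumes a simple linear model in which the response time of each disk is a fixed overhead plus the width divided by the transfer rate, and solves for the widths that make all response times equal. The function name stripe_widths and the (overhead, rate) tuple layout are hypothetical.

```python
def stripe_widths(total_kb, disks):
    """Split one stripe of total_kb across disks so response times are equal.

    disks : list of (overhead_s, rate_kb_per_s) per disk, e.g. taken from
            measured performance data (assumed model: t = overhead + width / rate)
    Returns (common_response_time_s, [width_kb per disk]).
    """
    rate_sum = sum(rate for _, rate in disks)
    # From sum(rate_i * (t - overhead_i)) == total_kb, solve for t.
    t = (total_kb + sum(o * r for o, r in disks)) / rate_sum
    widths = [r * (t - o) for o, r in disks]
    if min(widths) <= 0:
        raise ValueError("a disk is too slow to take part in this stripe")
    return t, widths


if __name__ == "__main__":
    # Two disks with equal overhead; the first transfers twice as fast.
    t, widths = stripe_widths(600, [(0.01, 4000), (0.01, 2000)])
    print(t, widths)
```

Under this model a faster disk receives a proportionally wider share of each stripe, so that all disks finish their part at about the same time.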
Volume manager 204 judges whether the requested performance
of the logical disk apparatus can be ensured or not
when a logical disk apparatus construction is requested by a
system manager, on the basis of this measured performance
data 209. Introduction of a disk apparatus is carried out by a
format (initialization) instruction, after the disk apparatus
is connected, the system is started up, and the disk
connection is confirmed. By using the format data as the
write data, disk performance measurement and initialization
of a disk apparatus can both be carried out together. This is
not a mere format; metadata (disk management data 210) is
also written so that the disk apparatus is controlled
under the volume manager 204. FIG. 3 shows a result of
performance measurement which is carried out on disk
apparatus 111. The numerals in FIG. 3 show the above
mentioned metadata which are stored in the disk apparatus.
In other words, they are referred to as a snapshot.

In FIG. 3, a reference name 501 specifies the disk apparatus
111 in a system, and index information 502 specifies the
type of disk apparatus 111. The volume manager 204 obtains
this information by reading out a database, given by a
system manager, which is included in catalogue data
relating to disk apparatus 111, or by reading out the label
information of disk apparatus 111. This information gives the
disk rotation speed and the number of sectors per track
on each cylinder. In FIG. 3, the numeral 503 denotes the
capacity of disk apparatus 111, the numeral 504 denotes an
access time without seek and rotational waiting time, and the
numeral 505 denotes an access time, indicated in microseconds,
when a seek is carried out from the most distant
position with one rotational waiting time. This measurement
is carried out by measuring the elapsed time between the
completion of access to the reference position and the
completion of access to the next sector or the most distant
sector. The ten data from 506 to 507 express, in KB/S, the
transfer rates measured at positions every 100 MB from
cylinder zero. These disk performance data 209 are referred to
or utilized when a logical disk is constructed, as explained
later. It is assumed that there is no difference between
reading performance and writing performance for the disk
apparatuses which are used in this embodiment.
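
The fields of the performance data described for FIG. 3 could be held in a record such as the following Python sketch. The class name DiskPerformance, the field names, the helper rate_at, and the assumption that both access times are in microseconds are illustrative only; no values from FIG. 3 are reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiskPerformance:
    reference_name: str        # 501: identifies the disk apparatus in the system
    type_index: str            # 502: identifies the type of disk apparatus
    capacity_mb: int           # 503: capacity
    best_access_us: int        # 504: access time without seek or rotational wait
    worst_access_us: int       # 505: farthest seek plus one rotational wait
    transfer_kb_s: List[int]   # 506..507: rates sampled every 100 MB from cylinder zero

    def rate_at(self, offset_mb: int) -> int:
        """Return the sampled transfer rate for the 100 MB zone holding offset_mb."""
        idx = min(offset_mb // 100, len(self.transfer_kb_s) - 1)
        return self.transfer_kb_s[idx]
```

A logical disk construction routine could consult rate_at when deciding which area of which disk to combine into a stripe.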

In general, when a new disk apparatus is introduced into
a system, a process called format is carried out for finding
bad sectors and writing label information. In this
embodiment, this format process may be carried out at the
same time as the process for collecting performance data
209 of a disk apparatus by performance measurement means
208, as described above. Therefore, the cost of performance
measurement can be reduced, and the burden on the system
manager can also be reduced.

The following is the explanation of the measurement of
the through-put performance of an input-output bus. The object
of this measurement is to construct a logical disk
considering, in addition to the input-output performance of
the disk apparatus, the data transfer performance (a part of
performance data 209) of the input-output bus, which could be
a factor preventing synchronized operation, and conflicts on
the same bus. Input-output units, including all disk apparatus
respectively connected to a plurality of input-output buses,
are driven in combination to load the buses. The response time
of the input-output operation of each input-output unit is
measured. Thereby, it is known whether there is any spare
capacity for transferring data on each input-output bus and
also on the higher-level input-output buses to which the
former input-output bus is connected, or what the limit value
of the transfer ability is. A logical disk apparatus is
constructed by
selecting the disk apparatus constructing the stripe in order
to avoid the data transfer ability limitation of a common
input-output bus being reached by the data transfer of disk
apparatus operating in parallel. Thereby, performance
deterioration caused by an input-output bus bottleneck when
data are inputted and outputted to a logical disk apparatus
can be avoided.

Accordingly, this measurement does not merely measure
through-put performance; it also defines conditions that
generate input-output bus conflicts for the input-output units
connected to the bus being measured, according to the starting
order, and actually generates a bus conflict to measure the
input-output latency of each input-output unit. Thereby, it is
possible to learn the bus arbitration policy and to insert a
starting delay in order to avoid unnecessary bus conflicts
generated when starting the input-output of the disk apparatus
comprising the stripes. In this embodiment, however, the
through-put performance of SCSI bus 112 and local bus 107, to
which a disk apparatus is connected, is measured, and the
stripe is constructed so as not to exceed the limit of the
through-put performance of each input-output bus when
accessing the logical disk apparatus comprised of stripes.
Regarding SCSI bus 112, the limit value of the through-put
performance can be obtained by running all input-output units
connected to the

of stripes and stripe width. In other words, performance
improvement by disk striping cannot be obtained if an
input-output size is not a multiple of the above mentioned
chunk size.

In step 602, the necessary number of disk apparatus which
operate in parallel to achieve the performance of an appointed
logical disk apparatus, in other words, the WAY number of the
stripe, is determined according to the performance data 209
of the disk apparatus obtained by the performance measurement
means 208. After the determination of the WAY number,
vacant areas on the disk apparatus controlled by this
input-output processing system are searched at step 603, and
it is determined whether it is possible to construct stripes
on as many different disk apparatus as the necessary WAY
number. However, disk striping is not always needed in order
to satisfy the appointed performance reference. If it is not
necessary, vacant areas are simply searched on the disk
apparatuses. After that, at step 604, it is judged from the
given input-output size whether there is any effect on
performance, or whether the requested performance reference
does not need any disk striping, when constructing a logical
disk apparatus having a stripe construction whose chunk size
is equal to the input-output size. In case no effect on
performance is found by
bus 112 (the same four disk apparatuses are connected to the constraint of stripe width, or no disk striping is 
SCSI bus 112 in the same way in this embodiment), and by ^ necessary, disk areas are reserved and an expected perfor- 
calculating total amount of input-output data obtained within mance of logical disk apparatus is reported to a user at step 
a certain period of time. By comparing the total amount 610 before completion. 

obtained in the above processes with the through-put per- When disk striping of more than two WAYs is effective, 

formancc in case that one disk apparatus operates, we know it is judged at step 605 if there is any possibility of stripe 

how many disk apparatuses can operate without being construction by the same kind of disk apparatus. If it is 

limited by bus transfer rate. In this ernbodirnent, since the possible, it is judged at step 606 if mere is any performance 

maximum transfer rate of the disk apparatus is 2 MB/s and difference between inner and outer circumferences of disk, 

the maximum transfer rate of SCSI bus 112 is 8 MB/s. the if there is performance difference between Inner and outer 

operation is not simply limited by bus transfer rate if the four circumference of disk, it is judged at step 607 if (here is any 

disk apparatus connected to SCSI bus 112 operate at the 33 possibility of using disk apparatus, which constructs stripes, 

same time. from the same offset position. If it is possible to use from the 

Similarly, according to measurement of transfer rate of same offset position, disk apparatus are constructed by stripe 

local bus 107 by operating every disk apparatus and the alternately to the reverse direction at step 608. If the result 

high-speed device IIS. it is known that the transfer rate is no as indicated **n" at step 605. step 606 and step 607. the 

corresponds to the total of transfer rate of all input -output ^ stripe is constructed in order to equalize the response speed 

units at 40 MB/s, Eventually, no matter how input-output of respective disk apparatus by stripe width, at step 608. 

operation is carried out to every input-output unit, a con- Operations in step 609 is further supple mentarily 

struction of an input-output bus and an input-output unit of explained as follows. Disk striping is constructed as follows, 

this embodiment is not limited by transfer rate of the First, chunk size is determined. An input-output size is 

input-output bus. 45 regarded as chunk size unless the appointed input-output 

However, if it is different from this embodiment, for size is larger man the amount of stripe width necessary to 

example, if transfer rate of SCSI bus is slower than the achieve logical disk apparatus performance. On the other 

transfer rate of SCSI bus of this embodiment, and only two hand, when an input-output size is too large, input-output 

disk apparatuses can operate by the transfer rate of SCSI bus size divided by natural number is regarded as chunk size 

at the maximum transfer rate of the disk apparatus, more so according to the necessity. Now. assume that 512 KB is 

than three disk apparatuses which comprise the same logical appointed for input-output size. In this case, for example, 

disk apparatus constructed by stripe could not be constructed chunk size becomes 128 KB. which is a quarter of 512 KB. 

by disk apparatus connected to the same SCSI bus. if it is possible to achieve performance by 4 WAYs of 

Next, a construction method of a logical disk apparatus is average 32 KB stripe width. Data transfer rate at the position 

explained as follows. FIG. 4 shows a flow chart which 53 of disk apparatus constructing stripes is obtained from 

explains operations of a logical disk construction means 213 performance data 209 of disk apparatus which is already 

comprising a logical disk apparatus constructed by stripe. obtained. The chunk is divided by the inverse ratio of the 

The method is explained referring to FIG. 4. hi step 601. a data transfer rate to get the stripe width. Thereby, response 

user, in other words, a system manager inputs necessary time of disk apparatus constructing respective stripes is 

logical disk apparatus performance of stripe construction. 60 averaged when input-output operation is carried out to the 

for example. 8 MB/S . input-output size which an application logical disk apparatus. 

program outputs in case of accessing the logical disk Next, supplementary explanation of operations in step 

apparatus, for example. 512 KB. and necessary size of the 608 is as follows. Disk striping in step 608 is constructed as 

logical disk apparatus, for example. 1 GB. Thereby, this follows. In this case, chunk size and stripe width of each disk 

input-output processing system tries to construct a logical 63 apparatus are also determined first Stripes are constructed 

disk apparatus which satisfies a specification. Here, the alternately from the top and from the end of the same usage 

input-output size is specified in order to restrict the number area of disk apparatus constructed by stripe of even number. 
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The stripe width is equivalent to one track width. The size 
of one track at the head position and the tail position is 
known by the number of sectors in c l u ded in the tracks at the 
position shown by performance information 502 of disk 
apparatus, and chunk size is determined such as (size of the 
head traefc+size of the tail track)x( number of stripe WAYs/ 
2). Accordingly, the stripe width is equivalent to the size per 
track at the time of input and output. 
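
The chunk size and stripe width rules of steps 609 and 608
can be illustrated with a small sketch. It is only an
illustration under stated assumptions: the function names
are hypothetical, the divisor is taken as the smallest
natural number that works, and the "inverse ratio" is read
as the inverse ratio of per-byte transfer time, so that each
stripe width is proportional to the transfer rate of its
disk and width divided by rate is the same on every disk.

```python
from typing import List


def choose_chunk_kb(io_size_kb: int, ways: int, avg_stripe_kb: int) -> int:
    """Step 609 (assumed reading): use the I/O size as the chunk size, or
    divide it by the smallest natural number that brings it down to
    ways * avg_stripe_kb or less."""
    target = ways * avg_stripe_kb
    if io_size_kb < 1 or target < 1:
        raise ValueError("sizes and WAY number must be positive")
    for n in range(1, io_size_kb + 1):
        if io_size_kb % n == 0 and io_size_kb // n <= target:
            return io_size_kb // n
    raise AssertionError("unreachable: n == io_size_kb always qualifies")


def stripe_widths_kb(chunk_kb: int, rates_mb_s: List[float]) -> List[float]:
    """Split one chunk so that width / rate is equal for every disk
    (assumption: faster disks receive proportionally wider stripes)."""
    total = sum(rates_mb_s)
    return [chunk_kb * rate / total for rate in rates_mb_s]


def alternating_chunk_kb(head_track_kb: int, tail_track_kb: int, ways: int) -> int:
    """Step 608: (head track + tail track) x (number of WAYs / 2)."""
    if ways % 2:
        raise ValueError("alternating construction assumes an even WAY number")
    return (head_track_kb + tail_track_kb) * (ways // 2)


# The 512 KB / 4 WAY / 32 KB example from the text gives a 128 KB chunk.
print(choose_chunk_kb(512, 4, 32))
print(stripe_widths_kb(128, [2.0, 2.0, 2.0, 2.0]))
```

With four identical 2 MB/s disks the 128 KB chunk splits
into equal 32 KB widths; with unequal rates the widths
differ while the per-disk transfer time stays the same. The
track sizes passed to `alternating_chunk_kb` would come from
the per-disk performance information and are not given in
the text.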

When logical disk construction data 211 for constructing
a logical disk apparatus is generated and registered as
described above, the performance that can be expected of
this logical disk apparatus at the recommended input-output
size, in other words the internally determined chunk size,
is reported at step 611 before completion. When disk
striping is not necessary according to the performance level
and input-output size given beforehand, or when no
improvement of performance is expected, the input-output
size needed to satisfy the given performance level, together
with the single disk performance, is reported for a single
disk apparatus.

FIG. 5 and FIG. 6 show part of the management information
of a logical disk apparatus constructed as described above
and of the respective stripe-constructed disk apparatuses
comprising the logical disk apparatus. Copies of this
management information are stored as metadata on every disk
apparatus 111 controlled by this input-output processing
system, and are read into memory 102 controlled by this
input-output processing system as necessary. FIG. 5 and
FIG. 6 show general tabular formats for convenience, but
this information is stored on the actual disk apparatus and
in memory according to a template controlled by the program.

The management information, shown in FIG. 5 and FIG. 6, of
a logical disk apparatus constructed according to step 609
and of its stripe-constructed disk apparatuses is explained
as follows. FIG. 5 includes logical volume name 701 of the
logical disk apparatus; an application program accesses the
logical disk apparatus by logical volume name 701. Volume
size 702 shows the volume size of the logical disk. Volume
type 703 shows the volume type of the logical disk, which is
a 4 WAY stripe construction in the present example. Usage
region 704 shows the disk apparatuses constructing the
stripes and the areas of those disk apparatuses, which are
specified by the names c0t0d0-0, c1t0d0-0, c2t0d0-0 and
c3t0d0-0. For ease of understanding, a disk apparatus is
specified by c0t0d0 and an area number by -0. However, these
are logical expressions and do not identify addresses on the
hardware. Further, FIG. 6 shows the area information of a
disk apparatus shown in usage region 704, that is, the
region information specified by index c0t0d0-0. FIG. 6
includes region name 801, physical disk name 802 of the
physical disk apparatus where the region is arranged,
starting offset 803 of the region, region size 804, data
arrangement direction 805, and stripe width 806 of the
region.
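
The two tables can be pictured as a pair of record types.
The following is a minimal sketch, not the on-disk template
itself: the class and field names are hypothetical labels
for items 701-704 and 801-806, and every value other than
the region names, the 4 WAY type and the 1 GB example size
is invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Region:                 # FIG. 6, one entry per usage region
    region_name: str          # 801
    physical_disk: str        # 802
    start_offset_kb: int      # 803
    region_size_kb: int       # 804
    direction: str            # 805: "forward" or "reverse"
    stripe_width_kb: int      # 806


@dataclass
class LogicalVolume:          # FIG. 5
    volume_name: str          # 701
    volume_size_kb: int       # 702
    volume_type: str          # 703
    usage_regions: List[str]  # 704, indexes into the region table

    @property
    def ways(self) -> int:
        """Number of stripe WAYs, one per usage region."""
        return len(self.usage_regions)


names = ["c0t0d0-0", "c1t0d0-0", "c2t0d0-0", "c3t0d0-0"]
regions: Dict[str, Region] = {
    name: Region(
        region_name=name,
        physical_disk=name.rsplit("-", 1)[0],  # "c0t0d0" plus area number "-0"
        start_offset_kb=0,                     # illustrative value
        region_size_kb=256 * 1024,             # 1 GB spread over 4 WAYs
        direction="forward",
        stripe_width_kb=32,                    # illustrative value
    )
    for name in names
}
volume = LogicalVolume("vol0", 1024 * 1024, "4 WAY stripe", names)
print(volume.ways, regions["c0t0d0-0"].physical_disk)
```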

Next, the operation when an input-output operation is
requested to a logical disk apparatus having the management
information (logical disk construction data 211) shown in
FIG. 5 and FIG. 6 is explained. When an application program
(control program 201 is also a kind of application program)
accesses this logical disk apparatus, the chunk size as the
input-output size and an offset value within the logical
disk apparatus are given. The application program can
inquire of volume manager 204 about the chunk size
beforehand. When volume manager 204 receives the inquiry, it
reports attributes such as the chunk size of the specified
logical disk apparatus. When this input-output processing
system receives the input-output request, the system refers
to the management data, that is, logical disk construction
data 211 of the logical disk apparatus, and knows that the
logical disk apparatus has a 4 WAY stripe construction as
shown in usage region 704. When the system is informed by
the region information that this logical disk apparatus is
constructed by stripe, the system calculates how many chunks
of the stripes correspond to the offset. For a region
arranged in the reverse direction, in other words a reverse
directional region, the accumulated total track size
corresponding to the number of offset chunks counted from
the tail of the region is regarded as the offset, and the
track size is regarded as the input-output size. Then, the
system gives an input-output instruction to the respective
disk apparatuses comprising the stripes. For a forward
directional region, the offset is calculated similarly from
the top of the region, and an input-output instruction is
given by regarding the size of the track as the input-output
size. For all regions constructing the logical disk
apparatus, input-output is started from the disk apparatus
whose head position, as maintained by disk driver 205, is
farthest from the requested position, and the input-output
operation to the logical disk apparatus is completed after
waiting for the completion of all input-output operations
to the disks.
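
The per-region offset calculation can be sketched as
follows. This is an illustration under assumptions, not the
patented routine: the function names are hypothetical, the
track sizes are taken as a list ordered from the top of the
region to its tail, and the returned offset is measured from
the top for a forward region and from the tail for a reverse
region.

```python
from typing import List, Tuple


def chunk_index(logical_offset_kb: int, chunk_kb: int) -> int:
    """Number of whole chunks that precede the requested offset."""
    return logical_offset_kb // chunk_kb


def region_io(track_sizes_kb: List[int], index: int,
              reverse: bool) -> Tuple[int, int]:
    """Return (offset, io_size) for one region.

    The offset is the accumulated size of the tracks that precede the
    requested one, counted from the top (forward) or from the tail
    (reverse); the I/O size is the size of the requested track itself.
    """
    if not 0 <= index < len(track_sizes_kb):
        raise IndexError("chunk index outside the region")
    ordered = track_sizes_kb[::-1] if reverse else track_sizes_kb
    return sum(ordered[:index]), ordered[index]


tracks = [48, 44, 40, 36]          # illustrative: outer tracks hold more
idx = chunk_index(256, 128)        # offset 256 KB, 128 KB chunks
print(region_io(tracks, idx, reverse=False))
print(region_io(tracks, idx, reverse=True))
```

The same chunk index therefore lands on a large outer track
in one region and a small inner track in its reversed
partner, which is what lets the alternating construction
even out the response time.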

As already explained, the track size at each position inside
the region is obtained from the database which this
input-output processing system provides for every kind of
disk apparatus.

According to this embodiment, a performance collection
means collects the performance features of the disk
apparatuses connected to the system, from data given by the
system manager or by the performance measurement means. A
logical disk is constructed by the logical disk construction
means on the basis of the collected data, and the logical
disk is operated by the logical disk control means.
Therefore, the response time needed for input-output of one
stripe of data on each disk comprising the logical disk is
equalized, which improves logical disk performance.

When a new disk apparatus is introduced into the system,
performance data 209 of the disk apparatus can be collected
by performance measurement means 208 in combination with the
format processing which is always carried out to find bad
sectors and to write label information. Therefore, the
burden on the system manager can be reduced, as well as the
cost of collecting performance data 209.

Further, input-output operations for measuring the maximum
access time, the minimum access time and the transfer rate
are actually carried out on each disk apparatus actually
mounted in the system. The response time is measured, and a
logical disk is constructed using a table holding the
measured data. Therefore, the most suitable logical disk
apparatus can be constructed.

Further, the logical disk apparatus is constructed in
consideration of the data transfer performance of the
input-output bus and the competition on the same bus, which
could be factors hindering the input-output performance and
the synchronous operation of the disk apparatuses. The data
transfer performance of the input-output bus is obtained
either from construction data given by the system manager,
or by having the performance measurement means operate in
combination the input-output units, including every disk
apparatus, connected to a plurality of input-output buses.
Thereby, the system is loaded, and the response time of the
input-output operation of each input-output unit is
measured. According to the measurement, it is decided
whether there is any margin in the data transfer ability of
each input-output bus, or of a higher input-output bus to
which the input-output bus is connected, or what the limit
value of the transfer ability is. The disk apparatuses of a
logical disk constructed by stripe are selected so that the
data transfers of the disk apparatuses operating in parallel
do not reach the limit of the data transfer ability of the
common input-output bus at input and output to the logical
disk apparatus. Therefore, the performance deterioration
caused by an input-output bus bottleneck during input-output
operations for the logical disk apparatus can be avoided.

The data of each chunk constructing the stripes of the
logical disk apparatus are arranged to be included in one
track of the disk apparatus. Thereby, a means is provided
for equalizing the head seek cost (corresponding to the
positioning time) of sequential access regardless of the
seek direction. Therefore, a more flexible disk drive can be
constructed, because, apart from the disk cache effect when
reading data, the difference is small whether seeking is
carried out from the outer to the inner circumference or
from the inner to the outer circumference.

Further, when a plurality of disk apparatuses having a
performance difference between the inner and outer
circumferences are constructed by stripe, the stripes are
constructed so that inner and outer circumferences are
combined alternately. Thereby, a uniform input-output
response time can be obtained regardless of the read/write
position on the logical disk apparatus constructed by
stripe.

Further, when there is an input-output request to a logical
disk apparatus constructed by stripe, the order in which
input-output instructions are issued to all of the necessary
disk apparatuses is controlled so that the input-output
operation of the logical disk apparatus finishes within the
shortest period of time, in consideration of the distance
from the present disk head position to the requested access
position and of the input-output size at which each disk
apparatus constructing a stripe performs well. Therefore,
the efficiency of the input-output operation of the logical
disk apparatus is improved.

Further, since elements such as the WAY number and the
stripe width of a logical disk apparatus are generated
automatically, the burden on the system manager is reduced,
and the most suitable logical disk is constructed easily.

EMBODIMENT 2

This embodiment carries out a plurality of input-output
operations to the same disk apparatus together when an
input-output request whose input-output size includes a
plurality of chunks is sent to the logical disk apparatus
constructed as explained in the first embodiment. This
embodiment is based on the assumption that, in such a case,
the input-output performance of the logical disk apparatus
as a whole can be improved, because the input-output
operation to the logical disk apparatus can finish earlier,
by sending a plurality of input-output instructions to the
same disk apparatus together at one time.

However, depending on the disk apparatuses constructing the
logical disk apparatus, the priority determination system of
the input-output bus connected to the disk apparatuses, or
the characteristics of the device driver which drives the
disk apparatuses, sending a plurality of input-output
instructions to the same disk apparatus together does not
always improve the overall performance. In such a case, the
input-output time of the logical disk apparatus is improved
by deliberately inserting a synchronizing point which
synchronizes the operation between the disk apparatuses, as
necessary.

In this embodiment, a data chaining function is needed
since a plurality of input-output instructions are gathered
into one input-output instruction. This data chaining
function is realized by providing disk driver 205 and SCSI
bus adapter 109 with a scattering/gathering function for
data transferred to and from the memory, and by using a DMA
list.

In the logical disk apparatus explained in the first
embodiment, gathering a plurality of input-output
instructions does not always improve the performance, since
the logical disk apparatus includes a region where the data
arrangement direction is the reverse of the seek direction.
Therefore, this gathering of input-output instructions is
not carried out for such a logical disk apparatus in the
second embodiment. If the logical disk apparatus in the
first embodiment is a logical disk apparatus constructed by
step 609 explained in FIG. 4, that is, a logical disk
apparatus which equalizes the response speed between disks
by the stripe width, and the input-output bus has enough
transfer ability, gathering is regarded as advantageous in
this embodiment. Therefore, a plurality of input-output
instructions to the same disk apparatus are gathered.

EMBODIMENT 3

The third embodiment realizes high-speed processing in the
input-output processing system of the present invention.
FIG. 7 shows an implementation means which is arranged in
disk controller 110. FIG. 7 includes a SCSI control portion
302, a disk control portion 304 and a timer portion 305. In
this embodiment, timer portion 305 is arranged in disk
controller 110. However, it is also possible to use system
timer 103 connected to memory bus 104 in place of timer
portion 305.

First, the operations of disk driver 205 and disk controller
110 are explained as follows. Disk controller 110 can
receive a value to be set in timer portion 305 (a part of
the time limit setting means) by a SCSI vendor unique
command (a part of the time limit arrangement means) from
disk driver 205, via SCSI bus adapter 109 and SCSI control
portion 302. After the time-out value is set by the command
in a count register (not shown in FIG. 7) inside the timer,
the timer starts counting down as soon as a read or write
command is received from the initiator (the generator of the
command). When the count register indicates zero, timer
portion 305 stops counting down and reports to SCSI control
portion 302 that the count value has become zero. Receiving
this report, SCSI control portion 302 asserts the abort
lines of the controller internal bus. Whatever phase it may
be in, disk control portion 304 records the phase at the
time of interruption and the transferred data quantity in an
internal register (not shown in FIG. 7) which can be
referred to from SCSI control portion 302, before it stops
the operation. If SCSI control portion 302 is disconnected
from SCSI bus 112, it reconnects to the initiator, moves to
a status phase, and sends the transferred data quantity
(status information) as a message. If the timer is still
operating when the whole data transfer is completed, SCSI
control portion 302 resets the count register to zero.

Next, the operations of this embodiment are explained in
detail. Disk driver 205 operates as follows when the
time-out value is set. When requesting time-out operation
for a data transfer from disk driver 205, a user program of
disk driver 205, that is, data transfer driver 203 or volume
manager 204 in this embodiment, sets a time-out value when
it calls the data transfer subroutine of disk driver 205,
using a field of the block device control table which is
common to the system (a table for controlling a device which
transfers data in block units, as a disk apparatus does) and
which disk driver 205 may use freely. When the time-out
value is set, disk driver 205 transmits the time-out value
and makes a request for sending the above-mentioned vendor
unique command to SCSI bus adapter 109 before disk driver
205 makes a request for sending a read/write command to SCSI
bus adapter 109. After the request for sending the
read/write command to SCSI bus adapter 109 is completed,
disk driver 205 sleeps until the completion of processing is
notified by an interrupt from SCSI bus adapter 109. After
being notified of the completion of processing, disk driver
205 resets the time-out value in the above-mentioned control
table to zero, and reports the transferred data quantity in
the data transfer quantity field (status data) which is
prepared in the control table.

Next, a concrete example in which this embodiment is applied
to a logical disk apparatus constructed by stripe is
explained.

The input-output processing system in the present embodiment
uses a logical disk apparatus constructed by stripe in order
to accelerate processing by control program 201, which
provides high-speed device 108 with data and writes output
data from high-speed device 108 into a logical disk
apparatus. In order to utilize the processing ability of
high-speed device 108 effectively, reducing the latency
before processing starts is considered more significant than
the data quantity actually transferred, especially when the
data transfer is started. Since a large quantity of data is
dealt with in the first place, the overall performance is
not affected by an increase in the number of input-output
operations.

Control program 201 provides a time-out value and an
input-output size to a logical disk apparatus constructed by
stripe via data transfer driver 203 as required. When volume
manager 204 receives the time-out value via the
above-mentioned block device control table, volume manager
204 sets the time-out value received from data transfer
driver 203 into the time-out field of the control table
prepared for each disk apparatus when input and output of
the disk apparatuses constructed by stripe is started.

FIG. 8 shows the state in which volume manager 204, having
set a time-out value and read from the logical disk
apparatus constructed by stripe over four disk apparatuses,
synchronizes with the four disk apparatuses after the
time-out has occurred and receives the transferred data
quantity from each disk apparatus. FIG. 8 includes a first
disk apparatus 901, a second disk apparatus 902, a third
disk apparatus 903 and a fourth disk apparatus 904. The
vertical direction for each disk apparatus shows the data
transfer flow of that disk apparatus. Each horizontal dotted
line marks the data quantity of one chunk of the logical
disk. Accordingly, in this example, an input-output size of
four chunks has been given to the logical disk apparatus
constructed by stripe.

In the input-output processing system of the present
invention, a plurality of instructions to the same disk
apparatus can be gathered if possible and effective, as
already explained in the second embodiment. In FIG. 8,
input-output instructions for four chunks are sent together
to each disk apparatus, and a time-out value is attached to
the input-output instructions.

As already explained, disk driver 205 and SCSI bus adapter
109 have data chaining functions. Therefore, there is no
problem even if the memory addresses of the destination or
source are not continuous (in FIG. 8, assuming that the flat
surface is the memory space, the horizontal direction is
continuous).

In the transfer states of disk apparatuses 901-904 shown in
FIG. 8, at the point where volume manager 204 synchronizes
with the four disk apparatuses after the time-out has
occurred, the transfers of all disk apparatuses except the
fourth disk apparatus 904 are incomplete. However, since
every disk apparatus has completed data transfer up to the
third chunk, shown at 905, volume manager 204 reports to
data transfer driver 203 that the data transfer of three
chunks has been completed. This example shows that the
transferred data quantity may become zero if the time-out
setting is too short. Therefore, it is effective for a user
to give an input-output size of a plurality of chunks of the
logical disk apparatus to which the input-output request is
sent, and to give a correspondingly appropriate time-out
value.
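
The report made after the time-out can be sketched as taking
the minimum over the disks. This is a minimal illustration,
assuming that each disk reports its transferred quantity in
KB and that a chunk counts as complete only when every disk
has transferred its full stripe width for that chunk; the
function name and the numbers are hypothetical.

```python
from typing import List


def completed_chunks(transferred_kb: List[int],
                     stripe_widths_kb: List[int]) -> int:
    """Whole chunks finished on every disk when the time-out fired."""
    if len(transferred_kb) != len(stripe_widths_kb):
        raise ValueError("one transferred quantity per stripe is expected")
    return min(done // width
               for done, width in zip(transferred_kb, stripe_widths_kb))


# Situation like FIG. 8: only the fourth disk finished all four chunks.
print(completed_chunks([100, 110, 120, 128], [32, 32, 32, 32]))
# A time-out that is too short leaves one disk short of its first stripe.
print(completed_chunks([10, 40, 40, 40], [32, 32, 32, 32]))
```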

Although the input-output operation is completed by setting
a time limit in this embodiment, almost all data transfers
are synchronized, as shown in FIG. 8.

When read/write is carried out with an input-output size of
a plurality of chunks, and input-output operations to the
same disk apparatus constructed by stripe are not gathered
together because gathering is not effective, volume manager
204 synchronizes the disk apparatuses chunk by chunk. When
the input-output instructions for the first chunk are sent
to every disk apparatus, the time-out value specified by the
upper process is set. For the second and subsequent chunks,
the elapsed time from sending the input-output instruction
of the first chunk to completing the synchronization is
measured by a system timer. The elapsed time is subtracted
from the time-out value, and the reduced time-out value is
set when the input-output instruction of the second chunk is
sent. Thereafter, this is repeated until the time-out value
becomes zero. An example of time setting using a SCSI vendor
unique command is shown in this embodiment. However, neither
the vendor unique command nor the SCSI interface necessarily
has to be used.
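
The chunk-by-chunk handling of the time-out value can be
sketched as a loop that charges each synchronization against
the remaining budget. This is an illustration only: the
function name, the `send_chunk` callback and the injectable
`clock` are hypothetical, and the callback is assumed to
issue one chunk to every disk with the given time limit and
to return whether all disks finished within it.

```python
import time
from typing import Callable


def transfer_chunks(n_chunks: int, timeout_s: float,
                    send_chunk: Callable[[int, float], bool],
                    clock: Callable[[], float] = time.monotonic) -> int:
    """Send chunks one at a time, shrinking the time-out by the elapsed time.

    Returns the number of chunks that completed on every disk.
    """
    remaining = timeout_s
    done = 0
    for index in range(n_chunks):
        if remaining <= 0:
            break                          # time-out value has reached zero
        start = clock()
        finished = send_chunk(index, remaining)
        remaining -= clock() - start       # subtract the elapsed time
        if not finished:
            break                          # this chunk was cut off by the limit
        done += 1
    return done
```

With a 10 second limit and chunks that each need 3 seconds,
three chunks complete and the fourth is cut off after the
last second of the budget.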

As described above, in the present embodiment, a time limit
setting means is arranged which sets a time limit on the
input-output operation, and a completion means is arranged
which completes the input-output operation when the time
limit is over. If the predetermined time limit is over
before the input-output operation of the predetermined data
quantity is completed, the input-output operation is
completed and the data quantity transferred within the time
limit is reported at that time. Therefore, the processing
delay caused by waiting for input-output synchronization can
be avoided.

The input-output control apparatus includes a timer
function, a means for completing the input-output operation,
and a means for reporting the status when input-output is
interrupted (a report of the transferred data quantity).
Therefore, the function is easily realized.

In a logical disk apparatus comprising a plurality of disk
apparatuses, when the plurality of disk apparatuses are
operated in parallel, the completion of input-output is
synchronized between the disk apparatuses by setting a time
limit on the input-output operation. Thereby, input-output
processing for this logical disk apparatus can be completed
by the time desired by a user. In other words, the system
sends an input-output instruction to each disk apparatus
comprising the logical disk apparatus, setting the same time
limit given from the upper process. When the input-output
operations are completed and a response has been received
from every disk apparatus, the input-output completion
statuses of the respective disk apparatuses are reported
together to the upper process. Thereby, although the
completion states of the input-output operations of the
respective disk apparatuses may differ from each other, it
is possible to complete the input-output operation for the
logical disk apparatus by the time the user desires.

Further, the synchronizing method explained in this
embodiment is applied to a logical disk apparatus
constructed by stripe, as a synchronizing method between a
plurality of disk apparatuses. Thereby, the response time of
an input-output operation to the logical disk apparatus is
guaranteed by the time limit. At the same time, if the
difference in completion status between input-output
operations is kept small, that is, if a data quantity
appropriate to the response time is set, the input-output
operation becomes more likely to be completed on every disk
apparatus.

This embodiment provides a further realization of high-speed
processing in the input-output processing system of the
present invention. In the fourth embodiment, a mechanism for
carrying out high-speed data transfer between input-output
units in the input-output systems explained in the
above-mentioned embodiments 1-3 is explained.

Conventionally, when data are transferred between
input-output units, the data are read from the transfer
source into the data space of the user program which
controls the data transfer, and the read data are written to
the apparatus of the transfer destination. The memory space
of the user program is under the memory control of the
system, which has placed a burden on the memory control of
the system during the data transfer. In addition, DMA
operation sometimes cannot be carried out from the devices
to the data space of the user program, which has decreased
the data transfer rate. The performance deterioration caused
by burdening memory control can be prevented by allocating,
on the physical memory of the system, a buffer memory area
to which the data are transferred and which is outside the
memory control of the system, and by controlling the
transfer from a control process which runs in kernel space.

FIG. 9 shows an example of bus arbiter 105 for realizing
this high-speed transfer between devices which do not
contain logical disks. In FIG. 9, source register 401 and
destination register 402 are each mapped into the memory
space of the system, and the ID on local bus 107 of
high-speed device 108 is registered in them. Bus arbitration
logic circuit 403 controls the local bus.

FIG. 10 is a flow chart which shows how data transfer
driver 203 carries out high-speed transfer between
input-output units except a logical disk. FIG. 11 is a flow
chart which shows how bus arbiter 105 carries out
high-speed transfer between input-output units including a
logical disk.

When control program 201 supplies a transfer source
device, a classification thereof, a data offset, a transfer
destination device, a classification thereof, a data offset,
and a transfer data quantity, data transfer driver 203
operates according to the logic shown in the flow chart of
FIG. 10. The operation of data transfer driver 203 is
explained referring to FIG. 10. In step 1001, data transfer
driver 203 judges whether or not a logical disk apparatus is
included in the source device or destination device, by the
device classification (any of an ordinary device, a
high-speed device, or a logical disk apparatus) received from
control program 201. If a logical disk apparatus is included,
the data transfer function of bus arbiter 105 is not utilized,
because no interface to the volume manager and disk driver
is arranged in bus arbiter 105. Instead, the devices are
controlled in step 1004 utilizing memory 102 in the system,
which is controlled by data transfer driver 203, and the
device drivers (disk driver 205, high-speed device driver 206).
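
The judgment of step 1001 can be sketched as follows. The enumeration `DeviceClass` and the function `select_path` are hypothetical names introduced only for illustration; the three classifications and the two outcomes are taken from the description above.

```python
from enum import Enum


class DeviceClass(Enum):
    ORDINARY = "ordinary device"
    HIGH_SPEED = "high-speed device"
    LOGICAL_DISK = "logical disk apparatus"


def select_path(source, destination):
    """Step 1001: decide whether the bus arbiter's transfer function can be used.

    Returns "step 1004" (buffered transfer through memory 102) when either end
    is a logical disk apparatus, otherwise "step 1002" (continue toward the
    bus arbiter path).
    """
    if DeviceClass.LOGICAL_DISK in (source, destination):
        return "step 1004"
    return "step 1002"


print(select_path(DeviceClass.LOGICAL_DISK, DeviceClass.HIGH_SPEED))
print(select_path(DeviceClass.HIGH_SPEED, DeviceClass.HIGH_SPEED))
```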

The process shown in step 1004 is explained as follows.
Data transfer driver 203 keeps a static memory area for data
transfer processing on the system. This memory is kept as
pages, for example, continuously over 2 MB of physical
address space, which are not subject to paging. It is
reserved when the system is initialized and is mapped into
virtual space as kernel space. It is utilized as a two-face
buffer to carry out data transfer. If the chunk size of a
logical disk apparatus constructed by striping is 128 KB, it
is possible to assign an input-output size of 8 chunks. This
processing is characterized by carrying out asynchronous
double buffer processing in the one context provided by
control program 201. A logical disk apparatus is controlled
by volume manager 204, which controls input and output of a
plurality of disk apparatus having an asynchronous
interface. However, as already explained, the asynchronous
devices are mutually synchronized because the input and
output result of the logical disk apparatus must be returned
synchronously to the user. Therefore, when data transfer
driver 203 accesses a logical disk apparatus, it has to call
and wait until the processing is completed. Accordingly, in
order to carry out asynchronous operation of the two-face
buffer, this processing is carried out as follows.



[When the transfer source is a logical disk]

    Reading from the logical disk apparatus to the first buffer
+-> Writing from the first buffer to the high-speed device (asynchronous)
|   Reading from the logical disk apparatus to the second buffer
|   Waiting for writing from the first buffer to the high-speed device
|   Writing from the second buffer to the high-speed device (asynchronous)
|   Reading from the logical disk to the first buffer
+-- Waiting for writing from the second buffer to the high-speed device

[When the transfer source is a high-speed device]

    Reading from the high-speed device to the first buffer (asynchronous)
    Waiting for reading from the high-speed device to the first buffer
+-> Reading from the high-speed device to the second buffer (asynchronous)
|   Writing from the first buffer to the logical disk
|   Waiting for reading from the high-speed device to the second buffer
|   Reading from the high-speed device to the first buffer (asynchronous)
|   Writing from the second buffer to the logical disk
+-- Waiting for reading from the high-speed device to the second buffer
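
The first sequence above (logical disk as transfer source) can be sketched as follows. This is an illustrative sketch only: the function `transfer_from_logical_disk`, its callback parameters, and the use of a worker thread for the asynchronous write are assumptions, as is the even split of the 2 MB static area into two faces. Under that assumption each face holds 1 MB, which matches the 8 chunks of 128 KB mentioned in the description.

```python
from concurrent.futures import ThreadPoolExecutor

KB = 1024
STATIC_AREA = 2 * 1024 * KB     # 2 MB static area from the description
CHUNK_SIZE = 128 * KB           # example chunk size from the description
FACE_SIZE = STATIC_AREA // 2    # assumption: the area is split evenly into two faces
CHUNKS_PER_FACE = FACE_SIZE // CHUNK_SIZE


def transfer_from_logical_disk(read_face, write_face, n_faces):
    """Copy n_faces buffer loads with a two-face buffer.

    read_face(i) -> data : synchronous read from the logical disk (hypothetical).
    write_face(data)     : write to the high-speed device, run asynchronously here.
    """
    if n_faces == 0:
        return
    faces = [None, None]
    current = 0
    faces[current] = read_face(0)                 # read into the first buffer
    with ThreadPoolExecutor(max_workers=1) as writer:
        for i in range(n_faces):
            pending = writer.submit(write_face, faces[current])  # asynchronous write
            if i + 1 < n_faces:
                faces[1 - current] = read_face(i + 1)  # meanwhile fill the other face
            pending.result()                      # wait for the asynchronous write
            current = 1 - current                 # swap faces


blocks = [bytes([i]) * 4 for i in range(5)]
received = []
transfer_from_logical_disk(lambda i: blocks[i], received.append, len(blocks))
print(CHUNKS_PER_FACE, received == blocks)
```

The blocking read of the next face overlaps with the outstanding write of the current face, so a single calling context obtains the effect of asynchronous double buffering.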



On the other hand, suppose the devices subject to data
transfer do not include any logical disk apparatus, that is,
the data transfer is between high-speed devices, or between
high-speed devices and disk apparatus for which disk
striping is not in effect. In this case, data transfer is
carried out using bus arbiter 105, which has a data transfer
function. The operation in this case is explained as
follows. First, the case where data transfer driver 203
judges in step 1002 that there is no difference of transfer
rate between the devices, that is, data transfer between
high-speed devices, is explained. Since high-speed device
108 has a DMA controller, it is possible to carry out data
transfer without passing through memory 102, by handshaking
using bus arbiter 105. In step 1005, the data transfer driver
registers the IDs (device addresses) on local bus 107 of the
high-speed devices 108 which are the transfer source and
transfer destination, in source register 401 and destination
register 402 of bus arbiter 105 mapped in the memory space
of the system. Further, the data transfer driver sends a read
request to high-speed device 108 of the transfer source, with
a predetermined input-output size and an address of unused
physical memory space given by the system beforehand. The
data transfer driver also sends a size and an address,
together with a write request, to high-speed device 108 of
the transfer destination. Thereby, bus acquisition requests
are sent to the bus from the two devices. Here, the IDs are
registered in source register 401 and destination register
402. Further, since no busy bit of memory 102 (this bit is
explained below) is set in source register 401 or destination
register 402, bus arbiter 105 asserts bus grant signals after
confirming the transmitting and receiving request signals.
Thereby, data transfer is carried out without passing through
the memory. After completion of the data transfer, the data
transfer driver clears the source and destination registers
of the arbiter.

When it is judged in step 1002 that there is a difference
of transfer rate between the devices, in other words, when
one of disk apparatus 111 and high-speed device 108 is the
transfer source and the other is the transfer destination,
data transfer driver 203 reserves an area of memory 102 of
the input-output size, which is continuous on the physical
address and not divided across pages. In step 1003, each ID
is logically ORed with a memory use bit flag and registered
in source register 401 and destination register 402 of bus
arbiter 105. After that, in step 1005, data transfer driver
203 gives the transfer data size and the reserved memory
address to both devices registered in source register 401
and destination register 402, and sends read/write requests
to both of them. Since there is an indication of memory use
in step 1102, bus arbiter 105 asserts the bus grant signal
for the bus request from the source device in step 1103, and
sets, in source register 401, a bit which indicates data
transfer completion of the source device. The device of the
transfer destination repeats sending its bus request until
the bus request is accepted. When bus arbiter 105 confirms
that the bus is free and that the data transfer completion
bit of the source device is ON, bus arbiter 105 asserts the
grant signal in step 1106. Thereby, the destination device
writes the data of the transfer source, which has been
transferred to memory 102 reserved for the transfer, into
its own device. Thereby, the data transfer is carried out.
After the data transfer is completed, data transfer driver 203
clears source register 401 and destination register 402 in
bus arbiter 105, and releases memory 102 which has been
reserved for the data transfer.
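
The two grant behaviours of the bus arbiter described above can be modelled by the following simplified sketch. It is a software model for illustration, not the circuit of FIG. 9: the class `BusArbiter`, the bit positions `MEMORY_USE` and `DONE`, and the 6-bit ID field are assumptions that do not appear in the description.

```python
MEMORY_USE = 0x80   # hypothetical position of the memory use bit flag
DONE = 0x40         # hypothetical position of the source completion bit
ID_MASK = 0x3F      # hypothetical width of the device ID field


class BusArbiter:
    def __init__(self):
        self.source = 0
        self.destination = 0

    def register(self, source_id, destination_id, use_memory):
        # The driver ORs each ID with the memory use flag when a buffer is reserved.
        flag = MEMORY_USE if use_memory else 0
        self.source = source_id | flag
        self.destination = destination_id | flag

    def source_done(self):
        # Marks data transfer completion of the source device.
        self.source |= DONE

    def grants(self, requesting_ids):
        """Return the device IDs that receive a bus grant for the given requests."""
        requests = set(requesting_ids)
        src = self.source & ID_MASK
        dst = self.destination & ID_MASK
        if not self.source & MEMORY_USE:
            # Direct transfer: grant both sides once both requests are present.
            return {src, dst} if {src, dst} <= requests else set()
        if not self.source & DONE:
            return {src} & requests      # staged transfer: source fills memory first
        return {dst} & requests          # then the destination drains it

    def clear(self):
        self.source = self.destination = 0


arbiter = BusArbiter()
arbiter.register(3, 5, use_memory=True)
first = arbiter.grants({3, 5})
arbiter.source_done()
second = arbiter.grants({3, 5})
print(first, second)
```

In the staged case the destination keeps requesting the bus but is granted it only after the completion bit of the source is set, which is how the reserved memory absorbs the difference of transfer rate.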

As described above, since the data transfer driver carries
out data transfer by keeping a two-face buffer for data
transfer, it is possible to carry out faster data transfer
between input-output units. That is, it is possible to
provide a data transfer system which places little burden on
the memory control of the system, in other words, which can
prevent performance degradation caused by a burden on the
memory control of the system.

Further, in this embodiment, the system controls the DMA
controller arranged in the control unit which controls each
input-output unit, and a controller is provided on the main
input-output bus. This controller synchronizes the data
transfer of the respective input-output units on the bus
without passing through any memory. Therefore, it is
possible to avoid transferring data to memory unnecessarily.

In order that the control program which carries out data
transfer (the data transfer driver) can absorb the velocity
difference between input-output units, the system reserves
memory for the data transfer controller (bus arbiter)
connected to the main input-output bus. Therefore, it is
possible for the data transfer controller (bus arbiter) to
regulate the transfer rate between devices utilizing the
memory. Accordingly, high-speed data transfer between
input-output units can be carried out efficiently.

EMBODIMENT 5

This embodiment is an example in which the data transfer
of each device on local bus 107 is controlled and
synchronized on the bus without passing through memory 102,
to avoid useless data transfer to memory 102. This is
realized by controlling the DMA controller of each device in
the input-output system constructed in embodiments 1-4.
Needless to say, the DMA transfer rates of the devices have
to match each other. In this embodiment, a logical disk
apparatus is often constructed to have four stripe disks,
each in a different disk apparatus on each SCSI bus.
However, the arrangement of a logical disk apparatus often
needs to be changed in order to rearrange data. If the disk
apparatus corresponding to the data transfer destination is
on the same SCSI bus, volume manager 204 sends a SCSI copy
command to all stripe disk apparatus, with the corresponding
disk apparatus as the copy destination. This makes it
possible to change the data arrangement of a logical disk
apparatus. This fifth embodiment is applied to, for example,
a system which is constructed by a plurality of logical disk
apparatus constructed by striping, where a plurality of SCSI
buses are connected to a main input-output bus (local bus), a
plurality of logical disk apparatus are connected to each
SCSI bus, and every SCSI bus is connected to a single disk
apparatus. In such a system, two logical disk apparatus are
connected to the same bus with the same stripe width.
Therefore, when data transfer between the logical disk
apparatus is required, the disk apparatus connected to the
same bus use the SCSI copy command between the corresponding
stripes. In this system, since data transfer is carried out
directly between these two logical disk apparatus,
high-speed data transfer which places little burden on the
system can be realized. The embodiment can also be applied
to different kinds of input-output units, such as a
high-speed device which is provided with a plurality of
input-output ports having a SCSI controller function.
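
The stripe-by-stripe copy between two logical disk apparatus can be illustrated by the following planning sketch. It assumes, for illustration only, that each logical disk is described by a mapping from a SCSI bus name to the member disk on that bus; the function `plan_stripe_copies` and the bus and disk names are hypothetical, and no actual SCSI command is issued.

```python
def plan_stripe_copies(source_disks, destination_disks, source_width, destination_width):
    """Pair the stripe disks of two logical disks bus by bus.

    source_disks / destination_disks: dict mapping a SCSI bus name to the
    member disk of that logical disk on the bus (hypothetical representation).
    Returns one (bus, source_disk, destination_disk) tuple per bus, each of
    which would be carried out by a copy command confined to that bus.
    """
    if source_width != destination_width:
        raise ValueError("stripe widths differ; corresponding stripes do not line up")
    if source_disks.keys() != destination_disks.keys():
        raise ValueError("the logical disks are not spread over the same SCSI buses")
    return [(bus, source_disks[bus], destination_disks[bus])
            for bus in sorted(source_disks)]


plan = plan_stripe_copies(
    {"scsi0": "a0", "scsi1": "a1"},
    {"scsi0": "b0", "scsi1": "b1"},
    source_width=128 * 1024,
    destination_width=128 * 1024,
)
print(plan)
```

Because every pair stays on one bus, the copies can proceed in parallel on the separate buses without the data passing through system memory.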

In the above-mentioned respective embodiments, SCSI is
explained as an example of an input-output interface.
However, the interface does not need to be SCSI in
particular. Needless to say, other kinds of interface could
be used.
In the fifth embodiment, copying between disk apparatus
is explained using the COPY command of SCSI. However, if
a function corresponding to the COPY command of SCSI can
be realized in disk control units, it could be used for
copying between disk apparatus.
What is claimed is: 

1. An input-output processing system for inputting and 
outputting a large quantity of data comprising a logical disk 
control means, said logical disk control means comprising: 
a performance data collection means for collecting data
from performance data where performance characteristics
of a plurality of disk apparatus constructing
input-output system are given by a system manager or
from direct measurement of the performance by operating
said disk apparatus;
a logical disk construction means for constructing a
logical disk apparatus using said plurality of disk
apparatus on the basis of performance data collected by
said performance data collection means,
where said logical disk construction means forms logical
disk management data so that the stripe width is set in
order to equalize response time needed for input and
output corresponding to one stripe data of each disk
apparatus constructing said logical disk apparatus; and
said logical disk control means controls said logical disk
apparatus by said logical disk management data.

2. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, further
comprises

a time limit setting means for setting a limit time required
to operate said input-output unit at sending input-output
instruction to input-output units; and

a means for completion of input-output operation which is
started by said input-output instruction when the set
time limit passed.

3. An input-output processing system for inputting and
outputting a large quantity of data of claim 2, further
comprises

a timer for setting time set by said time limit setting
means;

a means for completing input-output operation by receiving
time expiration information from the timer.

4. An input-output processing system for inputting and
outputting a large quantity of data of claim 2, wherein

said input-output unit is a logical disk apparatus comprised
of a plurality of disk apparatus.

5. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, further
comprises

a time limit setting means for setting limit time in relation
to a processing of an input-output instruction at sending
input-output instruction to input-output units; and

a processing completion means for completing input-output
operation which is started by said input-output
instruction when the set time limit is expired.

6. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, wherein

said performance measurement means performs an
input-output instruction to measure the response time by
setting conditions in which the response time for each
disk apparatus is the shortest or longest.

7. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, wherein

said performance measurement means carries out
performance measurement of the disk apparatus as a part of
initialization process to the disk apparatus when disk
apparatus are added to the system.

8. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, wherein

said performance data collection means collects bus
performance data by construction of input-output bus
which a plurality of disk apparatus are connected to,
system construction data to which the system manager
gives bus transfer performance, or actual operation of
an input-output unit connected to said bus by said
performance measurement means;

said logical disk construction means constructs a logical
disk apparatus on the basis of this collected performance
data considering bus transfer performance connected
to each disk performance.

9. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, wherein

in said logical disk construction means,

data transfer quantity for every input-output instruction
assigned to disk apparatus comprising logical disk
apparatus is arranged in one track of the disk apparatus.

10. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, wherein

said logical disk construction means equalizes apparent
input-output performance of a logical disk apparatus no
matter where the data is placed on the disk apparatus,
by alternate combination of inner and outer of
circumferences of respective disk apparatus, if the logical disk
apparatus is constructed by a plurality of homogeneous
disk apparatus whose performance is different between
inner and outer of circumferences.

11. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, wherein

when an input-output request is sent to a logical disk
apparatus,

said logical disk control means dynamically judges the
state of head position of disk apparatus shortly before
sending an input-output instruction to every disk
apparatus constructing the logical disk apparatus, and sends
the input-output instruction so that a time for carrying
out input and output to the logical disk apparatus
becomes shortest.

12. An input-output processing system for inputting and
outputting a large quantity of data of claim 11, wherein

when an input-output request is sent to a logical disk
apparatus using an input-output size in which a plurality
of input-output requests are generated to each of a
plurality of disk apparatus;

said logical disk control means judges performance
characteristics of disk apparatus comprising this logical
disk apparatus and characteristics of input-output bus
which is connected to a disk, and decreases frequencies
for sending a plurality of input-output instructions to
the same disk apparatus, according to the necessity.

13. An input-output processing system for inputting and
outputting a large quantity of data of claim 11, wherein

when an input-output request is sent to a logical disk
apparatus using input-output size in which a plurality of
input-output requests are generated to each of a
plurality of disk apparatus;

said logical disk control means sends the input-output
instruction so that the input-output time to the logical
disk apparatus becomes shortest by arranging a
synchronizing point in the input-output request to these
plurality of disk apparatuses.

14. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, wherein

said logical disk control means automatically constructs a
logical disk apparatus which satisfies the performance
given by the system manager.

15. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, further
comprises

a data transfer driver for carrying out data transfer control
to input-output units,

said data transfer driver carries out data transfer by
ensuring two-face buffer on the system memory, when
an apparatus for sending the input-output instruction
includes a logical disk apparatus.

16. An input-output processing system for inputting and
outputting a large quantity of data of claim 15, further
comprises

a bus arbiter for controlling input-output bus connected to
an input-output unit;

said bus arbiter comprises a source register and a
destination register for registering data transfer source ID
and data transfer destination ID, respectively; and

said data transfer driver carries out data transfer without
using system memory by driving said bus arbiter

when an apparatus for sending the input-output
instruction does not include a logical disk apparatus.

17. An input-output processing system for inputting and
outputting a large quantity of data of claim 15, wherein

said data transfer driver drives said bus arbiter by
ensuring buffer for data transfer on the system memory,
when there is difference of data transfer rate between
input units and output units which are appointed by an
input-output instruction.

18. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, further
comprising

a plurality of disk apparatus comprising said logical disk
apparatus which are connected to different input-output
buses to construct a plurality of logical disk apparatus,
further comprising

a copy means arranged between disk apparatuses
connected to the same input-output bus at a disk control
apparatus for controlling respective disk apparatus
connected to said respective input-output buses, in
order to carry out data copy between different logical
disk apparatus.